21 research outputs found

    Preface to the special issue: adaptive and learning agents


    Combining Multiple Correlated Reward and Shaping Signals by Measuring Confidence

    Multi-objective problems with correlated objectives are a class of problems that deserves specific attention. In contrast to typical multi-objective problems, they do not require the identification of trade-offs between the objectives, as (near-)optimal solutions for any objective are (near-)optimal for every objective. Intelligently combining the feedback from these objectives, instead of looking at only a single one, can improve optimization. This class of problems is very relevant in reinforcement learning, as any single-objective reinforcement learning problem can be framed as such a multi-objective problem using multiple reward shaping functions. After discussing this problem class, we propose a solution technique for such reinforcement learning problems, called adaptive objective selection. This technique has a temporal-difference learner estimate the Q-function for each objective in parallel, and introduces a way of measuring confidence in these estimates. The confidence metric is then used to choose which objective's estimates to use for action selection. We show significant performance improvements over other plausible techniques on two problem domains. Finally, we provide an intuitive analysis of the technique's decisions, yielding insights into the nature of the problems being solved.
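    To make the mechanism concrete, the sketch below gives one plausible reading of adaptive objective selection in a tabular setting: one Q-table per correlated objective, all updated in parallel, with action selection delegated to the objective whose estimates currently look most trustworthy. The confidence measure here (the gap between the best and second-best action values) is a stand-in for illustration, not the metric proposed in the paper, and all names and constants are assumptions.

        import numpy as np

        n_states, n_actions, n_objectives = 25, 4, 3
        alpha, gamma, epsilon = 0.1, 0.95, 0.1
        rng = np.random.default_rng(0)

        # One Q-table per objective, all learned from the same transitions.
        Q = np.zeros((n_objectives, n_states, n_actions))

        def confidence(q_row):
            # Stand-in confidence: gap between best and second-best action value.
            top_two = np.sort(q_row)[-2:]
            return top_two[1] - top_two[0]

        def select_action(state):
            if rng.random() < epsilon:
                return int(rng.integers(n_actions))
            # Act greedily w.r.t. the objective we are most confident about.
            best_obj = max(range(n_objectives), key=lambda o: confidence(Q[o, state]))
            return int(np.argmax(Q[best_obj, state]))

        def update(state, action, rewards, next_state):
            # TD update for every objective on the same transition;
            # `rewards` holds one (possibly shaped) reward per objective.
            for o in range(n_objectives):
                target = rewards[o] + gamma * Q[o, next_state].max()
                Q[o, state, action] += alpha * (target - Q[o, state, action])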

    A conceptual framework for externally-influenced agents: an assisted reinforcement learning review

    A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration and interoperability between different approaches that use external information. In this work, while reviewing externally-influenced methods, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering collaboration by classifying and comparing the various methods that use external information in the learning process. The proposed taxonomy details the relationship between the external information source and the learner agent, highlighting how the information is decomposed, structured, and retained, and how it can be used to influence agent learning. As well as reviewing state-of-the-art methods, we identify current streams of reinforcement learning that use external information to improve the agent's performance and its decision-making process. These include heuristic reinforcement learning, interactive reinforcement learning, learning from demonstration, transfer learning, and learning from multiple sources, among others. These streams of reinforcement learning operate with the shared objective of scaffolding the learner agent. Lastly, we discuss further possibilities for future work in the field of assisted reinforcement learning systems.
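    As a purely illustrative aid (this is a review, so no single algorithm is implied), the taxonomy's dimensions named above could be encoded as a small data structure; every member and field name below is an assumption made for the sketch, not the paper's terminology.

        from dataclasses import dataclass
        from enum import Enum, auto

        class Source(Enum):
            # Streams named in the abstract that supply external information.
            HEURISTIC = auto()          # heuristic reinforcement learning
            INTERACTIVE = auto()        # interactive reinforcement learning
            DEMONSTRATION = auto()      # learning from demonstration
            TRANSFER = auto()           # transfer learning
            MULTIPLE_SOURCES = auto()   # learning from multiple sources

        class Retention(Enum):
            TRANSIENT = auto()    # advice consumed once, then discarded
            PERSISTENT = auto()   # advice stored and reused during learning

        @dataclass
        class AssistedRLMethod:
            name: str
            source: Source
            retention: Retention
            influences: str  # what the information modifies: reward, policy, ...

        example = AssistedRLMethod(
            name="potential-based reward shaping from a heuristic",
            source=Source.HEURISTIC,
            retention=Retention.PERSISTENT,
            influences="reward",
        )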

    A novel adaptive weight selection algorithm for multi-objective multi-agent reinforcement learning

    To solve multi-objective problems, multiple reward signals are often scalarized into a single value and further processed using established single-objective problem-solving techniques. While the field of multi-objective optimization has made many advances in applying scalarization techniques to obtain good solution trade-offs, the utility of applying these techniques in the multi-objective multi-agent learning domain has not yet been thoroughly investigated. Agents learn the value of their decisions by linearly scalarizing their reward signals at the local level, from which acceptable system-wide behaviour emerges. However, the non-linear relationship between the weighting parameters of the scalarization function and the learned policy makes the discovery of system-wide trade-offs time-consuming. Our first contribution is a thorough analysis of well-known scalarization schemes within the multi-objective multi-agent reinforcement learning setup. The analysed approaches intelligently explore the weight space in order to find a wider range of system trade-offs. As our second contribution, we propose a novel adaptive weight algorithm that interacts with the underlying local multi-objective solvers and allows for better coverage of the Pareto front. Our third contribution is the experimental validation of our approach by learning bi-objective policies in self-organising smart camera networks. We note that our algorithm (i) explores the objective space faster on many problem instances, (ii) obtains solutions with a larger hypervolume, and (iii) achieves a greater spread in the objective space.
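    The core operation discussed above, linear scalarization, is simple to state: a reward vector r is collapsed to the scalar w . r before a standard single-objective update is applied. The sketch below uses illustrative names and constants throughout (the paper's adaptive weight algorithm itself is not reproduced) and hints at why the choice of w matters, since the learned policy depends on it non-linearly.

        import numpy as np

        def scalarize(r, w):
            # Collapse a reward vector into one scalar: w . r,
            # assuming w >= 0 and sum(w) == 1.
            return float(np.dot(w, r))

        def q_update(Q, state, action, r, next_state, w, alpha=0.1, gamma=0.95):
            # A single tabular Q-learning update on the scalarized signal.
            target = scalarize(r, w) + gamma * Q[next_state].max()
            Q[state, action] += alpha * (target - Q[state, action])

        # Nearby weight vectors such as (0.5, 0.5) and (0.6, 0.4) can yield
        # very different learned trade-offs, which is why the weight space
        # must be searched rather than sampled uniformly.
        w = np.array([0.6, 0.4])
        Q = np.zeros((10, 4))
        q_update(Q, state=0, action=1, r=np.array([1.0, -0.2]), next_state=3, w=w)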

    Local search and restart strategies for satisfiability solving in fuzzy logics

    Satisfiability solving in fuzzy logics is a subject that has not been researched much, especially compared to satisfiability in propositional logics. Yet fuzzy logics are a powerful tool for modelling complex problems. Recently, we proposed an optimization approach to solving satisfiability in fuzzy logics and compared the standard Covariance Matrix Adaptation Evolution Strategy algorithm (CMA-ES) with an analytical solver on a set of benchmark problems. CMA-ES compared favourably to the analytical approach, especially on more fine-grained problems. In this paper, we evaluate two types of hillclimber in addition to CMA-ES, as well as restart strategies for these algorithms. Our results show that a population-based hillclimber outperforms CMA-ES on the harder problem class.
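    To illustrate the optimization view of fuzzy satisfiability, the sketch below maximises the satisfaction degree of a toy formula under Łukasiewicz semantics with a simple single-solution hillclimber. The benchmark problems, the population-based hillclimber, and the restart strategies evaluated in the paper are not reproduced here, and all names and constants are assumptions.

        import random

        def luk_and(a, b):
            # Lukasiewicz t-norm (conjunction).
            return max(0.0, a + b - 1.0)

        def luk_implies(a, b):
            # Lukasiewicz residual implication.
            return min(1.0, 1.0 - a + b)

        def satisfaction(x):
            # Degree to which the toy formula (x0 -> x1) AND (x1 -> x2)
            # holds for an assignment x in [0, 1]^3; 1.0 means satisfied.
            return luk_and(luk_implies(x[0], x[1]), luk_implies(x[1], x[2]))

        def hillclimb(n_vars=3, steps=5000, step_size=0.05, seed=0):
            rng = random.Random(seed)
            x = [rng.random() for _ in range(n_vars)]
            best = satisfaction(x)
            for _ in range(steps):
                i = rng.randrange(n_vars)
                cand = list(x)
                cand[i] = min(1.0, max(0.0, cand[i] + rng.uniform(-step_size, step_size)))
                val = satisfaction(cand)
                if val >= best:  # accept sideways moves to escape plateaus
                    x, best = cand, val
            return x, best

        assignment, degree = hillclimb()
        print(degree)  # approaches 1.0: the toy formula is satisfiable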